On the Topic Discovery Using Query Logs and Hyperlink
Abstract
With the rapid growth of the World Wide Web, the amount of information on the Web has expanded on an unpredictable scale. Recently, most researchers have attempted to use conventional information retrieval techniques to classify search results. Unlike a traditional document, however, a Web page has distinct characteristics of its own, so some researchers have begun to exploit the hyperlinks between Web pages. In this paper, we propose a topic discovery algorithm that combines query log analysis with hyperlink analysis. We use the query log to find the Web pages that are representative with respect to users' endorsements, and combine this with a link-based clustering algorithm to cluster similar topics. Each Web page is ranked according to both search engine users' endorsements and Web creators' endorsements. The experimental results show that our method performs better than the pure hyperlink analysis algorithm (ATD) in terms of topic discrimination and topic quality.
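The ranking idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's scoring formula, so everything here is an assumption made for illustration: the function name `rank_pages`, the use of click counts as a stand-in for users' endorsements, the use of in-link counts as a stand-in for creators' endorsements, and the mixing weight `alpha` are all hypothetical. The clustering step is not covered.

```python
from collections import Counter


def rank_pages(click_log, links, alpha=0.5):
    """Rank pages by a weighted mix of two endorsement signals.

    Hypothetical illustration, not the paper's method.
    click_log: iterable of (query, clicked_url) pairs from a query log.
    links:     iterable of (source_url, target_url) hyperlinks.
    alpha:     assumed weight on user endorsements (0.0 to 1.0).
    """
    links = list(links)

    # User endorsement (assumption): how often a page was clicked.
    clicks = Counter(url for _query, url in click_log)
    # Creator endorsement (assumption): in-links, ignoring self-links.
    inlinks = Counter(dst for src, dst in links if src != dst)

    pages = set(clicks) | set(inlinks) | {src for src, _dst in links}

    # Normalise each signal to [0, 1]; guard against empty input.
    max_clicks = max(clicks.values(), default=0) or 1
    max_inlinks = max(inlinks.values(), default=0) or 1

    scores = {
        page: alpha * clicks[page] / max_clicks
        + (1 - alpha) * inlinks[page] / max_inlinks
        for page in pages
    }
    # Highest score first; ties broken by URL for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    log = [("jaguar", "a"), ("jaguar car", "a"), ("jaguar", "b")]
    graph = [("c", "a"), ("c", "b"), ("a", "b")]
    for page, score in rank_pages(log, graph, alpha=0.5):
        print(page, round(score, 3))
```

Setting `alpha` to 1.0 ranks by the query-log signal alone and 0.0 by the link signal alone, which makes it easy to see how each kind of endorsement changes the ordering.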
Similar resources
Measure of systemic risk in the interbank market in Iran by buffer capital and hyperlink-induced topic search algorithm
Considering that the interbank market is considered as a night market to provide short-term liquidity to banks, one of the most important risks in this market - due to the short-term nature of transactions in this market - is systemic risk. Exercising this risk cycle will have devastating effects on monetary policymakers, such as the 2007-2009 crisis. In this study, first, the buffer capital ...
Analysis of User query refinement behavior based on semantic features: user log analysis of Ganj database (IranDoc)
Background and Aim: Information systems cannot be well designed or developed without a clear understanding of needs of users, manner of their information seeking and evaluating. This research has been designed to analyze the Ganj (Iranian research institute of science and technology database) users’ query refinement behaviors via log analysis. Methods: The method of this research is log anal...
Mining Associative Relations from Website Logs and their Application to Context-Dependent Retrieval Using Spreading Activation
We have devised a methodology that mines sequential navigation patterns from a website's logs to enable us to identify the most significant associative links in the networks. Spreading activation can then be applied to the generated network of weighted hyperlinks enabling the content-dependent, semantic retrieval of nodes in the network. This approach to information retrieval avoids many of the ...
Discovering and understanding word level user intent in Web search queries
Identifying and interpreting user intent are fundamental to semantic search. In this paper, we investigate the association of intent with individual words of a search query. We propose that words in queries can be classified as either content or intent, where content words represent the central topic of the query, while users add intent words to make their requirements more explicit. We argue t...
Query Topic Classification and Sociology of Web Query Logs
In the paper, the objects, tasks, and a general procedure of the sociological analysis of Web search engine query logs are described and illustrated by a methodologically complete study of the cross-nation search image changes based on two-year spaced query logs of the national search audience.
Publication date: 2006